Abstract

We analyze the statistical dependency structure of the S&P 500 constituents in the 4-year period from 2007 to 2010 using intraday
data from the New York Stock Exchange's TAQ database. With a copula-based approach, we find that the statistical dependencies
are very strong in the tails of the marginal distributions. This tail dependence is higher than in a bivariate Gaussian distribution,
which is implied in the calculation of many correlation coefficients. We compare the tail dependence to the market's average
correlation level as a commonly used quantity and find a nearly linear relation.



1. Introduction

The measurement of statistical dependence is often reduced
to the calculation of a correlation coefficient, such as the
Pearson coefficient [1] or the Spearman coefficient [2].
Correlation coefficients are widely used in various disciplines of
science. They are also often included in financial modeling, e.g., in
the Capital Asset Pricing Model (CAPM) [3] or Noh's model [4].

The usage of the correlation coefficient, however, presumes
a linear statistical dependence and that the observables are
nearly normally distributed. Due to the central limit theorem,
this might be justified in some cases, but often the statistical
dependence is much more complex. In these cases, the statistical
dependence cannot be represented by a single number. The
joint probability distribution, of course, holds all information on
the statistical dependence. However, the joint probability
distribution also contains the individual marginal probability
distributions. These can have different shapes depending on the
underlying process. The statistical dependence of different
systems therefore usually cannot be directly compared with this approach.
Copulae, first introduced by Sklar in 1959 [5, 6], permit
a separation between the pure statistical dependence and the
marginal probability distributions. This allows the statistical
dependence of diverse systems to be compared.

The usage of copulae is well established in statistics and
finance. There are many classes of analytical copula functions
that meet various properties [7]. Several studies of financial
markets are devoted to developing suitable copulae or fitting
existing ones to empirical data [8, 9, 10], or are based on a small
subset of assets [11]. In this study, we choose a different
approach. We perform a large-scale empirical study to disclose
the structure of the average pairwise copula of the US stock
returns. As the copula does not depend on the shape of the
return distribution, we are able to average over the copulae of
different stock pairs although the shapes of their marginal
distributions may differ, i.e., exhibit stronger or weaker tails. In particular,
we study the intraday stock market returns of the 428 continuous
S&P 500 constituents in 2007-2010 based on intraday data
from the New York Stock Exchange's TAQ database.

2. Copulae 

The basic concept is simple: Let $a$ and $b$ be two random
variables with probability densities $f_a(x)$ and $f_b(x)$ and cumulative
distributions $F_a(x)$ and $F_b(x)$, with

$$\int_{-\infty}^{+\infty} f_a(x)\,\mathrm{d}x = 1 , \tag{1}$$

$$F_a(x) = \int_{-\infty}^{x} f_a(x')\,\mathrm{d}x' , \tag{2}$$

and analogously for $b$. Further, let $f_{a,b}(x,y)$ be the joint
probability density and $F_{a,b}(x,y)$ be the joint cumulative distribution.
The inverse cumulative distribution function $F^{-1}$ is called
the quantile function. For example, $F^{-1}(0.05)$ represents the
value which 5% of all random samples are smaller than or equal to.
This evidently gives

$$F_a\left(F_a^{-1}(\alpha)\right) = \alpha . \tag{3}$$



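$F^{-1}(\alpha)$ is also called the $\alpha$-quantile. The copula $\mathrm{Cop}_{a,b}(u,v)$ is
defined as the cumulative joint distribution of quantiles,

$$\mathrm{Cop}_{a,b}(u,v) = F_{a,b}\left(F_a^{-1}(u), F_b^{-1}(v)\right) . \tag{4}$$

The copula density $\mathrm{cop}_{a,b}(u,v)$ is consequently defined by

$$\mathrm{cop}_{a,b}(u,v) = \frac{\partial^2}{\partial u\,\partial v}\,\mathrm{Cop}_{a,b}(u,v) . \tag{5}$$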



As the quantile functions $F^{-1}$ are scale free, the copula does
not depend on the underlying marginal distributions. It only



Preprint submitted to Elsevier 



January 11, 2013 



contains the pure statistical dependence. Thus, by obtaining 
the appropriate copula of a system, one can simply interchange 
the marginal distributions without any changes in the copula. 
This is very useful if the marginal distributions change for some 
reason, but the statistical dependence remains the same. We can 
rebuild the joint cumulative distribution from the copula and the 
individual distributions by 



$$F_{a,b}(x,y) = \mathrm{Cop}_{a,b}\left(F_a(x), F_b(y)\right) . \tag{6}$$
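
The independence of the copula from the marginal distributions can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it uses synthetic data, a hypothetical helper `pseudo_observations`, and an arbitrarily chosen strictly increasing map (`sinh`) to make one marginal heavier-tailed. The empirical distribution values $F_a(a)$, which are the only input the copula sees, stay the same.

```python
import math
import random


def pseudo_observations(sample):
    """Return the empirical CDF value (rank / n) of every observation."""
    order = sorted(range(len(sample)), key=lambda k: sample[k])
    u = [0.0] * len(sample)
    for rank, k in enumerate(order, start=1):
        u[k] = rank / len(sample)
    return u


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Two dependent synthetic series that share a common factor (illustration only).
common = [rng.gauss(0.0, 1.0) for _ in range(1000)]
a = [c + 0.5 * rng.gauss(0.0, 1.0) for c in common]
b = [c + 0.5 * rng.gauss(0.0, 1.0) for c in common]

# Change the marginal distribution of `a` with a strictly increasing map.
a_heavy = [math.sinh(2.0 * x) for x in a]

u_a = pseudo_observations(a)
u_heavy = pseudo_observations(a_heavy)
u_b = pseudo_observations(b)

# The pairs (F_a(a), F_b(b)) that define the copula are identical in both cases.
copula_unchanged = list(zip(u_a, u_b)) == list(zip(u_heavy, u_b))
```

Here `copula_unchanged` evaluates to `True` because a strictly increasing transformation preserves the ordering of the observations, and therefore their ranks.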



3. Average copula 

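To calculate the cumulative copula from empirical data of
two return time series $r_1$ and $r_2$, we use

$$\mathrm{Cop}_{r_1,r_2}(u,v) = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}_U(r_1(t)) \times \mathbf{1}_V(r_2(t)) , \tag{7}$$

where $T$ is the length of the time series, and $\mathbf{1}_U$ and $\mathbf{1}_V$ are indicator
functions relating to the sets

$$U = \left\{ x \mid x \le F_1^{-1}(u) \right\} , \tag{8}$$

$$V = \left\{ y \mid y \le F_2^{-1}(v) \right\} . \tag{9}$$

The quantile function $F_1^{-1}$ on empirical data is given by

$$F_1^{-1}(u) = \begin{cases} \inf \left\{ x \mid F_1(x) \ge u \right\} , & 0 < u \le 1 \\ \sup \left\{ x \mid F_1(x) = u \right\} , & u = 0 \end{cases} \tag{10}$$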



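and analogously for $r_2$. We define $F_1(x)$ empirically as the
fraction of values that are smaller than or equal to $x$ relative
to the total number of values. When calculating the empirical
copula density, it is useful to first define a resolution of the 2D
grid, e.g. $m = 50$. On this $m \times m$ grid, we can calculate the
copula density by

$$\mathrm{cop}^{(m)}_{r_1,r_2}\left(\frac{i}{m}, \frac{j}{m}\right) = \frac{1}{T} \sum_{t=1}^{T} \mathbf{1}_{U_i}(r_1(t)) \times \mathbf{1}_{V_j}(r_2(t)) , \quad i, j \in \{1, \dots, m\} , \tag{11}$$

with

$$U_i = \left\{ x \,\middle|\, F_1^{-1}\left(\frac{i-1}{m}\right) \le x \le F_1^{-1}\left(\frac{i}{m}\right) \right\} , \tag{12}$$

$$V_j = \left\{ y \,\middle|\, F_2^{-1}\left(\frac{j-1}{m}\right) \le y \le F_2^{-1}\left(\frac{j}{m}\right) \right\} . \tag{13}$$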
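
The estimators above can be sketched in a few lines. This is an illustrative, rank-based variant under stated assumptions rather than the code used for the study: the helper names `ranks`, `empirical_copula` and `empirical_copula_density` are hypothetical, the input series are synthetic, and comparing ranks with $u \cdot T$ agrees with the quantile formulation only up to one observation per margin. Each observation is assigned to exactly one cell of the grid.

```python
import random


def ranks(sample):
    """Integer ranks 1..n (ties, if any, are broken by position)."""
    order = sorted(range(len(sample)), key=lambda k: sample[k])
    r = [0] * len(sample)
    for rank, k in enumerate(order, start=1):
        r[k] = rank
    return r


def empirical_copula(r1, r2, u, v):
    """Share of time steps at which both returns lie at or below
    their u-quantile and v-quantile, respectively."""
    n = len(r1)
    hits = sum(1 for p, q in zip(ranks(r1), ranks(r2))
               if p <= u * n and q <= v * n)
    return hits / n


def empirical_copula_density(r1, r2, m=50):
    """Probability mass in each cell of an m x m grid of quantile bins."""
    n = len(r1)
    grid = [[0.0] * m for _ in range(m)]
    for p, q in zip(ranks(r1), ranks(r2)):
        i = (p - 1) * m // n  # quantile bin of r1(t), 0 .. m-1
        j = (q - 1) * m // n  # quantile bin of r2(t), 0 .. m-1
        grid[i][j] += 1.0 / n
    return grid


# Synthetic, positively dependent "returns" (illustration only).
rng = random.Random(1)
common = [rng.gauss(0.0, 1.0) for _ in range(2000)]
r1 = [c + 0.5 * rng.gauss(0.0, 1.0) for c in common]
r2 = [c + 0.5 * rng.gauss(0.0, 1.0) for c in common]

grid = empirical_copula_density(r1, r2, m=10)
```

Because the margins of a copula are uniform, every row and every column of `grid` sums to $1/m$ when $m$ divides the number of observations. Dependence shows up only in how that mass is distributed within the rows and columns.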



Of course, an accurate estimation of the copula density requires
a large number of data points. Thus, we estimate the average
copula using intraday data. We start with the calculation of
30-minute arithmetic returns, because market microstructure
distortions dominate at smaller return intervals [12, 13, 14]. We
expand our analysis further by calculating 1-hour, 2-hour and
4-hour returns. We obtain a very similar copula for all return
intervals. This is very surprising, because it is well-known that
the shape of the marginal return distribution changes towards



Figure 1: Average pairwise copula of the S&P 500 stock returns
in 2007-2010. The z-axis in (a) and the isolines in (b) are in
permille. The color shading in (a) illustrates the difference to
the Gaussian copula (positive values mean that the Gaussian copula
is less dense).

Figure 2: Average pairwise copula of the S&P 500 stock returns
during the crisis period from 2008/10/15 to 2009/4/1.



small return intervals: the tails of the distributions become
stronger [15, 16]. However, apparently this does not change the
statistical dependence. The results are shown in Figure 1, using
1-hour returns as an example. The copula has high density in the
outer quantiles. This corresponds to a higher correlation in the
tails of the return distribution than in its center. This is often
referred to as tail dependence [17, 18, 19]. Our results indicate
that on average, the upper tail dependence is stronger than the
lower tail dependence. For comparison, the average difference
to the Gaussian copula (which is implied by many correlation
coefficients) is illustrated in Figure 1a. The (standard normal)
Gaussian copula is given by



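$$\mathrm{Cop}_c(u,v) = F_c\left(F^{-1}(u), F^{-1}(v)\right) , \tag{14}$$

$$\mathrm{cop}_c(u,v) = \frac{f_c\left(F^{-1}(u), F^{-1}(v)\right)}{f\left(F^{-1}(u)\right) f\left(F^{-1}(v)\right)} . \tag{15}$$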



Here, $f_c$ and $F_c$ refer to the bivariate standard normal probability
density and cumulative distribution with correlation $c$. $f$ is
the univariate standard normal probability density, while $F^{-1}$ is
the corresponding quantile function. To calculate the average
difference $d$, we calculate the Gaussian copula density for
every coefficient $C_{i,j}$ of the correlation matrix $C$ of the $K = 428$
stocks and subtract it from the empirical copula density of the same pair,

$$d(u,v) = \frac{\sum_{i=1}^{K} \sum_{j=i+1}^{K} \left( \mathrm{cop}_{i,j}(u,v) - \mathrm{cop}_{C_{i,j}}(u,v) \right)}{K(K-1)/2} . \tag{16}$$



This tells us how badly the dependence is misestimated
if a Gaussian copula is assumed. The empirical copula
exhibits a stronger dependence than the Gaussian copula.
The probability of correlated extreme events is underestimated.
The lower tail dependence is stronger than the upper tail
dependence. In general, there is a trend that this behavior is more
pronounced towards large return intervals. This might be caused by
a more severe reaction to bad news than to good news. We will
discuss this in more detail in the next section. Another feature
of the empirical copula is the relatively high density in the (0,1)
and (1,0) corners, indicating the presence of anti-correlated
extreme events.
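
The Gaussian copula density of Eq. (15) and the pairwise average that enters Eq. (16) can be sketched as follows. This is a minimal illustration under stated assumptions, not the original implementation: the function names and the small example correlation matrix are hypothetical, and the ratio in Eq. (15) has been simplified analytically to a single exponential. Since the sum in Eq. (16) is linear, the average difference equals the average empirical density minus the average Gaussian density.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # univariate standard normal distribution


def gaussian_copula_density(u, v, c):
    """Density of the Gaussian copula with correlation c (|c| < 1).

    Evaluates f_c(x, y) / (f(x) f(y)) at x = F^{-1}(u), y = F^{-1}(v),
    using the closed form of that ratio.
    """
    x, y = _STD.inv_cdf(u), _STD.inv_cdf(v)
    exponent = -(c * c * (x * x + y * y) - 2.0 * c * x * y) / (2.0 * (1.0 - c * c))
    return math.exp(exponent) / math.sqrt(1.0 - c * c)


def average_gaussian_copula_density(u, v, corr):
    """Average of the Gaussian copula density over all pairs i < j."""
    K = len(corr)
    total = sum(gaussian_copula_density(u, v, corr[i][j])
                for i in range(K) for j in range(i + 1, K))
    return total / (K * (K - 1) / 2)


def average_difference(empirical_average, u, v, corr):
    """Average empirical density minus average Gaussian density at (u, v)."""
    return empirical_average - average_gaussian_copula_density(u, v, corr)


# Hypothetical 3 x 3 correlation matrix, for illustration only.
corr = [[1.0, 0.5, 0.5],
        [0.5, 1.0, 0.5],
        [0.5, 0.5, 1.0]]
center = average_gaussian_copula_density(0.5, 0.5, corr)
corner = average_gaussian_copula_density(0.01, 0.01, corr)
```

When this density is compared with cell probabilities estimated on an $m \times m$ grid, it has to be multiplied by the cell area $1/m^2$ (or integrated over the cell), since the grid estimate is a probability mass rather than a density.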

Figure 2 illustrates the copula during the market meltdown
between 2008 and 2009. Surprisingly, it exhibits a stronger
positive tail dependence than negative tail dependence. However,
the main observation is that the tail dependence is much higher here.
The assumption of the Gaussian copula would have been a dramatic
mistake during this period. The Gaussian copula is even being discussed
as having had a major impact on the financial crisis [20].

4. Dynamics of the copula 

It is evident that statistical dependencies of financial assets
change over time. For example, this can be caused by
microeconomic influences, changing political factors or herding effects.
Several studies address this issue with the concept of
correlation coefficients [21, 22, 23, 24]. Here, we approach this matter
with an empirical study of the changes in the average pairwise
copula. We calculate the average copula within 2-week
periods within the 2007-2010 period based on 1-hour returns.
Results are shown in Figure 3. To illustrate the structural changes
of the copula, we plot the isosurfaces in the tail regions. We
find that the tail dependence is stronger during financial
crashes, such as from Oct 2008 to Feb 2010. However, the
fluctuations of the tail dependence are very large. The tail dependence
reflects the current market situation in a sensitive manner.

Financial crashes are often accompanied by very large
correlation coefficients overall. This raises the question whether there
is some dependence between the market's average correlation
level and the tail dependence. To gain insight into this
question, we compare the average correlation coefficient of the
whole market in each 2-week period to the tail dependence. As
correlation coefficients are still widely used, this maps a
correlation coefficient to one of the most important features of the
copula.

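To quantify this tail dependence, we calculate the
probability of two returns to be simultaneously below the
quantile $\alpha$ or simultaneously above the quantile $1 - \alpha$. This very
simple form of a lower and an upper tail dependence coefficient is given by

$$\lambda_l(\alpha) = \mathrm{Cop}(\alpha, \alpha) , \tag{17}$$

$$\lambda_u(\alpha) = 2\alpha - 1 + \mathrm{Cop}(1 - \alpha, 1 - \alpha) , \tag{18}$$

where the upper coefficient follows from inclusion-exclusion, because the
margins of a copula are uniform.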
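
The probability of two returns lying simultaneously in the lower or in the upper tail can be estimated directly by counting. The sketch below is illustrative only: `ranks` and `tail_dependence` are hypothetical helper names, the input series are synthetic, and the tails are defined through ranks, which is an assumption about how the quantile thresholds are applied to finite samples.

```python
import random


def ranks(sample):
    """Integer ranks 1..n (ties, if any, are broken by position)."""
    order = sorted(range(len(sample)), key=lambda k: sample[k])
    r = [0] * len(sample)
    for rank, k in enumerate(order, start=1):
        r[k] = rank
    return r


def tail_dependence(r1, r2, alpha):
    """Return (lower, upper): the share of time steps at which both returns
    fall at or below their alpha-quantile, or above their (1 - alpha)-quantile."""
    n = len(r1)
    cut = alpha * n
    k1, k2 = ranks(r1), ranks(r2)
    lower = sum(1 for p, q in zip(k1, k2) if p <= cut and q <= cut) / n
    upper = sum(1 for p, q in zip(k1, k2) if p > n - cut and q > n - cut) / n
    return lower, upper


# Synthetic, positively dependent "returns" (illustration only).
rng = random.Random(2)
common = [rng.gauss(0.0, 1.0) for _ in range(2000)]
r1 = [c + 0.5 * rng.gauss(0.0, 1.0) for c in common]
r2 = [c + 0.5 * rng.gauss(0.0, 1.0) for c in common]

lower, upper = tail_dependence(r1, r2, alpha=0.05)
```

Two reference values help to read the result: for independent series both coefficients are close to $\alpha^2$, while for perfectly co-moving series both equal $\alpha$.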



More advanced tail dependence measures are discussed, e.g., in Ref.
[19]. However, as we only examine the difference between the
empirical copula and the Gaussian copula, we restrict ourselves
to this measure. We perform the analysis for return intervals
from 30 minutes to two hours. Results are shown in Figure 4.
We find a very strong relation between the tail dependence and the
average correlation coefficient. For comparison, we build the
average tail dependence coefficients $\lambda_l$ and $\lambda_u$ of the Gaussian
copula, given by

$$\lambda_l = \lambda_u = \mathrm{Cop}_c(\alpha, \alpha) . \tag{19}$$

Figure 3: Evolution of the S&P 500 stocks' average pairwise copula density. The isosurfaces correspond to a probability of
0.1‰ (blue) and 0.05‰ (red). The density in the tails is very high.



To calculate the average Gaussian tail dependence, for each
2-week period, we calculate the tail dependence of the Gaussian
copula based on the entries $C_{i,j}$ of the correlation matrix of this
period,

$$\bar{\lambda}(\alpha) = \frac{\sum_{i=1}^{K} \sum_{j=i+1}^{K} \mathrm{Cop}_{C_{i,j}}(\alpha, \alpha)}{K(K-1)/2} . \tag{20}$$



This makes it possible to assess how strongly the tail dependence
is misjudged overall when correlation coefficients or the
Gaussian copula are used.
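
The Gaussian tail dependence $\mathrm{Cop}_c(\alpha, \alpha)$ and its average over a correlation matrix can be evaluated numerically. The following is a sketch under stated assumptions, not the method used in the study: it rewrites the bivariate normal probability as the one-dimensional integral $\int_0^{\alpha} \Phi\left( (h - c\,\Phi^{-1}(s)) / \sqrt{1 - c^2} \right) \mathrm{d}s$ with $h = \Phi^{-1}(\alpha)$, where $\Phi$ is the standard normal cumulative distribution, and approximates it with a midpoint rule. The function names, the step count and the example matrix are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # univariate standard normal distribution


def gaussian_tail_dependence(alpha, c, steps=2000):
    """Cop_c(alpha, alpha) for the Gaussian copula with correlation c, |c| < 1.

    Conditions on the first variable and integrates over its
    uniform-scaled value s in (0, alpha) with a midpoint rule.
    """
    h = _STD.inv_cdf(alpha)
    scale = (1.0 - c * c) ** 0.5
    width = alpha / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * width
        total += _STD.cdf((h - c * _STD.inv_cdf(s)) / scale)
    return total * width


def average_gaussian_tail_dependence(alpha, corr):
    """Average of Cop_{C_ij}(alpha, alpha) over all pairs i < j."""
    K = len(corr)
    total = sum(gaussian_tail_dependence(alpha, corr[i][j])
                for i in range(K) for j in range(i + 1, K))
    return total / (K * (K - 1) / 2)


# Hypothetical correlation matrix of three assets, for illustration only.
corr = [[1.0, 0.2, 0.5],
        [0.2, 1.0, 0.8],
        [0.5, 0.8, 1.0]]
avg_tail = average_gaussian_tail_dependence(0.05, corr)
```

A quick plausibility check is the case $c = 0$, where the result reduces to $\alpha^2$, and the case $\alpha = 0.5$, where the known closed form $1/4 + \arcsin(c)/(2\pi)$ applies.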

The relation between the market's average correlation level
and the tail dependence appears to be almost linear. This is
similar to the Gaussian copula except that the tail dependence
is more pronounced. For small return intervals, such as $\Delta t$ =
30 min and 60 min, the tail dependence has a tendency to be
stronger than in the Gaussian case. For small quantiles, such
as $\alpha$ = 2% and 4%, there are many cases where this linear
relation does not hold. There are many outliers that feature a much
stronger tail dependence than in the Gaussian case. On larger
return intervals, the tail dependence becomes more and more
similar to the Gaussian case, which is consistent with studies of
the marginal distributions [15]. Here, the lower tail dependence
is significantly higher than the upper tail dependence, as
discussed in the previous section. This underlines the unsuitability
of the Gaussian copula for the estimation of correlated extreme
events, which is a key ingredient in the estimation of financial
risk.



5. Conclusion 

In a large-scale empirical study of the copula of the S&P 500
stocks, we disclosed important features of the dependence
structure. The copula makes it possible to isolate the statistical
dependence structure from features of the probability distributions,
such as heavy tails. In general, the overall average pairwise
copula of the 4-year period features stronger tails than the Gaussian
copula. Extreme events are much more correlated than assumed by
a linear correlation. Moreover, the empirical copula indicates the
presence of anti-correlated extreme events. Despite the large

Figure 4: Relation between tail dependence and average correlation level for different quantiles $\alpha$ and return intervals $\Delta t$.

differences between the Gaussian marginal distribution and the
distribution of high frequency returns, the dependency
structure is quite similar. In a more detailed study, where we
calculated the time-dependent empirical copula at a resolution of
two weeks, we showed that the Gaussian copula, in particular,
systematically underestimates the negative tail dependence: the
market reacts sensitively to large negative returns, resulting in a
collective downward motion. The evolution of the copula in the
4-year period discloses a strong relation between the market's
average correlation level and the tail dependence. For return
intervals of 4 hours and in the center region of the distribution, the
Gaussian copula describes the situation fairly well. But when
using smaller return intervals or estimating the tail regions, the
fluctuations in the correlation-tail-dependence relation become
very strong.